import pandas as pd
import altair as alt
This week's blog is about the importance of visualizations. I will take a bad example of a chart, in this case a pie chart released by Fox News in 2012 showing vice-presidential candidates, and build a new chart that is better overall. I will use Altair, which is another package that should be in your tool belt as a data scientist.
The lousy graph is below. You should be able to see a few things wrong with it. First off, a pie chart is a poor way to represent this data, because readers have to compare angles and areas, and that makes it hard to see which slice is bigger than the others. Palin leads Romney by 10 percentage points (70% to 60%), but the pie makes that gap hard to see.
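A quick sanity check shows a second problem with the pie. This is a minimal sketch in plain Python, assuming the three figures are meant to be read as slices of one whole (which is what a pie chart implies); `shares` is just an illustrative name for the numbers recreated from the chart.

```python
# Percentages recreated by hand from the original chart.
shares = {"Palin": 70, "Huckabee": 63, "Romney": 60}

# A pie chart implies parts of one whole, so the slices should sum to 100.
total = sum(shares.values())
exceeds_whole = total > 100

# Gap between the highest and lowest figure, in percentage points.
gap = max(shares.values()) - min(shares.values())

print(f"total = {total}%, exceeds 100%: {exceeds_whole}, gap = {gap} points")
```

The figures add up to well over 100, so they cannot be parts of a single whole, and a chart that compares each value on its own (like a bar graph) is a more natural fit.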
These first steps are just recreating the data.
fox_data = {
    'Palin': 70,
    'Huckabee': 63,
    'Romney': 60,
}
df_fox = pd.DataFrame(list(fox_data.items()))
df_fox
| | 0 | 1 |
|---|---|---|
| 0 | Palin | 70 |
| 1 | Huckabee | 63 |
| 2 | Romney | 60 |
df_fox.columns = ['GOP Nominee','Percentage Backing']
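The build-then-rename steps above can also be done in one go. This is just a sketch of an alternative, not a correction: the `columns` argument of `pd.DataFrame` names the columns at construction time, so the intermediate `0` and `1` headers never appear.

```python
import pandas as pd

fox_data = {"Palin": 70, "Huckabee": 63, "Romney": 60}

# Name the columns while building the frame instead of renaming afterwards.
df_fox = pd.DataFrame(
    list(fox_data.items()),
    columns=["GOP Nominee", "Percentage Backing"],
)

print(df_fox)
```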
Now, let's get rid of the pie chart and replace it with a bar graph.
(
    alt.Chart(df_fox)
    .mark_bar()
    .encode(
        x='GOP Nominee',
        y='Percentage Backing',
    )
    .properties(height=200, width=200)
)
Then let's add some color based on each category on our x-axis.
(
    alt.Chart(df_fox)
    .mark_bar()
    .encode(
        x='GOP Nominee',
        y='Percentage Backing',
        color='GOP Nominee',  # This is the added code
    )
    .properties(height=200, width=200)
)
Now, let's change the titles on each axis to make the information more digestible.
(
    alt.Chart(df_fox)
    .mark_bar()
    .encode(
        x=alt.X('GOP Nominee', axis=alt.Axis(title='Nominee')),  # This is the added code
        y=alt.Y('Percentage Backing', axis=alt.Axis(title='Percent')),  # This is the added code
        color='GOP Nominee',
    )
    .properties(height=200, width=200)
)
Lastly, let's add a title so readers know what the chart is about.
(
    alt.Chart(df_fox)
    .mark_bar()
    .encode(
        x=alt.X('GOP Nominee', axis=alt.Axis(title='Nominee')),
        y=alt.Y('Percentage Backing', axis=alt.Axis(title='Percent')),
        color='GOP Nominee',
    )
    .properties(height=200, width=200, title='Vice-Presidential Choice')  # This is the added code
)
As you can see, the bar graph is much cleaner, easier to read, and overall better than the pie chart we started with. In conclusion, Altair is an easy-to-use Python library that makes good-looking charts with just a few lines of code.